
#speech recognition

Records found: 10

#speech recognition · 07/01/2026

NVIDIA Unveils Nemotron ASR for Low-Latency Applications

Explore NVIDIA's new Nemotron Speech ASR model designed for voice agents and live captioning with low-latency performance.


#speech recognition · 10/09/2025

Boost ASR Accuracy with SpeechBrain: Build a Denoise + Recognition Pipeline in Python

A hands-on guide to building a compact SpeechBrain pipeline that generates speech, adds noise, enhances the audio with MetricGAN+, and measures ASR word error rate before and after denoising.


#speech recognition · 09/09/2025

Qwen3-ASR Flash: Alibaba's Single-Model Leap in Multilingual, Noise-Robust Speech Recognition

Qwen3-ASR Flash from Alibaba is a single ASR model that auto-detects and transcribes 11 languages, supports context injection for domain terms, and keeps WER below 8% on noisy or musical audio.


#speech recognition · 04/09/2025

OLMoASR: AI2’s Open ASR Suite Challenging OpenAI Whisper

AI2 released OLMoASR, an open ASR suite that includes models, training data identifiers, filtering recipes, and benchmarks, and competes closely with OpenAI Whisper across multiple tasks.


#speech recognition · 30/08/2025

Voice AI 2025: 20 Essential Blogs and News Sites to Follow

A concise guide to the 20 best voice AI blogs and news sites for 2025, covering research, product launches, ethics, and market trends to help developers and leaders stay informed.


#speech recognition · 29/08/2025

OpenAI Unveils GPT-Realtime: Unified Speech-to-Speech with SIP Calling and MCP Support

OpenAI released GPT-Realtime and the Realtime API with unified audio processing, SIP phone support, and MCP server integration, improving performance and enterprise deployment options, though key speech AI challenges remain.


#speech recognition · 29/07/2025

Amazon Unveils AI Architecture Slashing Inference Time by 30% Through Selective Neuron Activation

Amazon researchers created an AI architecture that cuts inference time by 30% by activating only task-relevant neurons, inspired by the brain's efficient processing.


#speech recognition · 17/07/2025

NVIDIA Launches Canary-Qwen-2.5B: The Leading ASR-LLM Hybrid Model with Unmatched Accuracy and Speed

NVIDIA's Canary-Qwen-2.5B model sets a new benchmark in speech recognition with a record-low word error rate and fast processing speed. This open-source, commercially licensed hybrid ASR-LLM model enables advanced audio transcription and language understanding.


#speech recognition · 17/07/2025

Mistral AI Unveils Voxtral: Leading Open-Source Speech Recognition Models with Advanced Audio Understanding

Mistral AI launches Voxtral, a family of open-weight speech recognition models that integrate transcription and language understanding, with support for long audio contexts and multiple languages.


#speech recognition · 06/05/2025

LLaMA-Omni2: China’s Breakthrough in Real-Time Speech-Enabled Large Language Models

Chinese researchers release LLaMA-Omni2, a modular speech language model that enables real-time spoken dialogue with minimal latency and strong performance using compact training data.
